Abstract: Glycosylation is one of the most common post-translational modifications in
eukaryotic cells and plays important roles in many biological processes, such as the 
immune response and protein quality control systems. It has been notoriously difficult to 
study glycoproteins by X-ray crystallography since the glycan moieties usually have a 
heterogeneous chemical structure and conformation, and are often mobile. Nonetheless, 
recent technical advances in glycoprotein crystallography have accelerated the accumulation 
of 3D structural information. Statistical analysis of "snapshots" of glycoproteins can 
provide clues to understanding their structural and dynamic aspects. In this review, we 
provide an overview of crystallographic analyses of glycoproteins, in which electron 
density of the glycan moiety is clearly observed. These well-defined N-glycan structures
are in most cases attributed to carbohydrate-protein and/or carbohydrate-carbohydrate 
interactions and may function as "molecular glue" to help stabilize inter- and intra-molecular 
interactions. However, the more mobile N-glycans on cell surface receptors, the electron
density of which is usually missing in X-ray crystallography, seem to guide the partner
ligand to its binding site and prevent irregular protein aggregation by covering 
oligomerization sites away from the ligand-binding site. 

Abbreviations: GlcNAc, N-acetyl-D-glucosamine; Man, D-mannose; Gal, D-galactose; Sia,
sialic acid; Glc, D-glucose; Fuc, L-fucose; GnT, N-acetylglucosaminyltransferase; CHO,
Chinese hamster ovary; HEK, human embryonic kidney; IgG, immunoglobulin G; Fc,
crystallizable fragment; ADCC, antibody-dependent cellular cytotoxicity; FcRn, neonatal
Fc receptor; NA, neuraminidase; APA, Antheraea pernyi arylphorin; Tr-β-gal,
Trichoderma reesei β-galactosidase; TLR, Toll-like receptor; ICAM, intercellular
adhesion molecule

1. Introduction 

Glycosylation is a common and highly diverse modification of proteins and occurs during or after 
protein synthesis [1]. More than 50% of eukaryotic proteins are glycosylated [2]. Glycosylation 
profoundly alters the behavior of proteins, making them more soluble, protecting them from 
proteolysis, covering antigenic sites, and altering the orientation of proteins on cell surfaces. The 
importance of protein glycosylation is becoming widely recognized through studies on protein 
localization and trafficking, biological half-life as well as investigations of cell-cell interactions. 
In almost all glycoproteins, the carbohydrate units are attached to the protein backbone either by N- or
O-glycosidic bonds or both. N-glycans are covalently attached to proteins at the amide of asparagine
(Asn) residues, forming an N-glycosidic bond. The consensus sequence for N-glycosylation, called a
sequon, is Asn-X-Ser/Thr, where X can be any amino acid except proline. In O-glycosylation, the
glycan is attached to the side chains of serine or threonine residues. Unlike N-linked glycosylation, no
consensus sequence defining an O-linked glycosylation site has been reported.
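
The sequon rule lends itself to a simple pattern search. The following is a minimal sketch, not a validated prediction tool: the function name `find_sequons` and the demo peptide are hypothetical, and a match marks only a potential N-glycosylation site, since the presence of a sequon does not guarantee that the Asn is actually glycosylated.

```python
import re

# Asn-X-Ser/Thr, where X is any residue except Pro (one-letter codes).
# The lookahead lets overlapping sequons (e.g. "NNTS") be reported separately.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(sequence: str) -> list[int]:
    """Return 1-based positions of Asn residues that start a sequon."""
    return [match.start() + 1 for match in SEQUON.finditer(sequence.upper())]


if __name__ == "__main__":
    demo = "MNASNPTANGT"  # hypothetical peptide, not a real protein
    print(find_sequons(demo))
```

In the demo peptide, the Asn followed by Pro is skipped, in line with the exclusion of proline at position X.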

Figure 1. (a) Representative chemical structures of high-mannose and complex-type
N-glycans. (b) N-glycan processing pathways in mammalian cells. The enzymes and
structures of intermediate N-glycans are shown. Glc I, α-glucosidase I; Glc II,
α-glucosidase II; ER Man, ER α-mannosidase; α-Man I, α-mannosidase I; GnT I,
β-N-acetylglucosaminyltransferase I; α-Man II, α-mannosidase II; GnT II,
β-N-acetylglucosaminyltransferase II; β4GalT, β-1,4-galactosyltransferase; SiaT,
sialyltransferase; GnT III, β-N-acetylglucosaminyltransferase III; GnT V,
β-N-acetylglucosaminyltransferase V; and Fut8, α1,6-fucosyltransferase.

Human N-glycan is typically composed of N-acetyl-D-glucosamine (GlcNAc), D-mannose (Man),
D-galactose (Gal), sialic acid (general name for N-acetylneuraminic acid, which can be abbreviated to
Sia or Neu5Ac), D-glucose (Glc) and L-fucose (Fuc) residues. N-glycan is classified into three groups:
high-mannose type, hybrid type, and complex type. Representative chemical structures of high-mannose
(Man9GlcNAc2) and biantennary complex-type (Gal2GlcNAc2Man3GlcNAc2) glycans are shown in
Figure 1a. Both types contain Man3GlcNAc2 core structures. In high-mannose type glycan,
Manα1-2Manα1-2Man, Manα1-2Manα1-3Man, and Manα1-2Manα1-6Man chains are designated the
D1, D2 and D3 arms, respectively. In biantennary complex-type glycan, the α1-3 and α1-6 branched
oligosaccharide chains (Galβ1-4GlcNAcβ1-2Man) are termed α1-3 and α1-6 arms, respectively.
The N-glycan precursor (Glc3Man9GlcNAc2) is assembled on a lipid carrier, dolichylpyrophosphate
(Dol-PP), and is transferred onto a polypeptide chain by oligosaccharyltransferase (OST) in the
endoplasmic reticulum (ER) lumen (Figure 1b). The attached N-glycan is sequentially processed by
various glycosyl-hydrolases and -transferases. Three glucose and one mannose residues are
immediately removed by glucosidase I, II and ER α-mannosidase. Removal of glucose residues is
closely related to the folding of glycoproteins. Further processing of N-glycans occurs along the
secretory pathway as the properly folded glycoprotein moves through the Golgi apparatus to its final
destination. The initial stage of the pathway consists of several trimming steps by α-mannosidase I that
generate a key intermediate, Man5GlcNAc2. In the Golgi apparatus, the conversion of high-mannose
type to hybrid-type glycan is initiated by the action of N-acetylglucosaminyltransferase I (GnT I),
which transfers GlcNAc, in a β1-2 linkage, to the α1-3 arm mannose residue of the Man5GlcNAc2
substrate. Hybrid-type glycan is further remodeled by a series of enzymes to yield complex-type
N-glycans. Several enzymes increase the heterogeneity of N-glycans, including
N-acetylglucosaminyltransferase III (GnT III), V (GnT V), and α1,6-fucosyltransferase (Fut8)
(Figure 1b). GnT III catalyzes the addition of GlcNAc via a β1-4 linkage to the β-Man of the mannosyl
core of N-glycans. This GlcNAc is designated as "bisecting GlcNAc" and is involved in the
suppression of cancer metastasis [3]. GnT V catalyzes the addition of a β1-6 linked GlcNAc unit to the
α1-6 linked Man of the trimannosyl core of N-linked glycans to form tri- or tetra-antennary branches [4].
Fut8 introduces a fucose residue onto the innermost GlcNAc of the N-linked biantennary complex-type
oligosaccharides via an α1-6 linkage [5] (Figure 1b). This reaction is denoted as "core-fucosylation"
and the Fuc residue is termed "core-fucose". Each glycosyltransferase has unique substrate specificity
that determines the N-glycan structure. For example, the action of GnT V and Fut8 is inhibited by the
presence of a bisecting GlcNAc residue. Through these modifications, complex-type glycan is often
branched and contains a trimannosyl core and several GlcNAc, Gal, sialic acid and Fuc residues, resulting
in a high level of complexity (Figure 1b).

3D structural information on glycoproteins helps in the understanding of N-glycan function.
However, the inherent flexibility of glycans hampers X-ray crystallographic analysis. In practice,
diffraction-quality crystals of glycoproteins are normally only obtained following either the abolition of the
glycosylation site by site-directed mutagenesis or enzymatic deglycosylation treatment [6-8]. The
number of glycoproteins resolved by X-ray or NMR represents less than 3% of the total number of
reported 3D structures in the Protein Data Bank (PDB) [9]. However, significant improvements in
experimental methods have led to an increase in the number of solved glycoprotein structures with
well-defined carbohydrate residues. Various methodologies using mammalian expression systems have
been developed for the production of large amounts of homogeneous glycoproteins. Chinese hamster
ovary (CHO)-lec 3.2.8.1 cells [10] and human embryonic kidney (HEK) 293S GnT I deficient cells [11]
are suitable for producing glycoproteins for crystallization. Since both cell lines lack GnT I activity,
uniform glycoprotein can be produced with the truncated N-linked oligosaccharide, Man5GlcNAc2
(Figure 1b). Moreover, co-cultivation of mammalian cells with an N-glycosylation inhibitor,
kifunensine or swainsonine, reduces the chemical heterogeneity of the product glycoform, and the
product can be readily deglycosylated with endoglycosidase (Figure 1b, [12]). Other eukaryotic expression systems,
such as yeast and insect cells, are also available for producing homogeneous protein for
crystallography. However, in fungal and insect expression systems, mature N-linked glycan is
generally of the heterogeneous high-mannose type, whereas human N-glycans are mainly hybrid or
complex types. Since glycan structure depends on the expression cell type, the relationship between
glycoform and its physiological function needs careful inspection.

Previous statistical analyses of the available X-ray diffraction data on oligosaccharides [9,13,14] 
identified the energetically-favorable conformations for individual sugar linkages. 
The GLYCOSCIENCES.de web portal [15] and the Glycoconjugate Data Bank [16] offer convenient ways
to search for carbohydrate structures in the PDB. In this review, we introduce several examples in 
which the glycans affect intra- and inter-molecular interactions. The glycans are often stabilized to 
assume highly ordered structures via extensive protein-carbohydrate or carbohydrate-carbohydrate 
interactions. We also introduce several cell-surface glycoprotein structures where the glycans seem to 
assist in proper ligand recognition and to prevent protein aggregation, although many of these glycans 
have disordered structures. 

2. Conserved Glycan Mediates Inter-Subunit Interactions of Proteins (Fc Fragment and
Influenza Neuraminidase)

2.1. N-Glycan on Fc Fragment Affects Intra- and Inter-Molecular Interactions
2.1.1. Overview of N-Glycan of Fc Fragment

Immunoglobulins are Y-shaped glycoproteins that participate in the adaptive component of the 
immune system which is directed against extracellular pathogens. The structural diversity of their 
antigen binding sites allows them to bind specifically to millions of structurally unique molecules. 
Immunoglobulin G (IgG) consists of two light and two heavy chains and comprises three independent 
parts connected through a flexible linker or hinge (Figure 2a). Two of these, the Fab fragments, are 
identical in structure, each with an antigen-specific binding site. The third one, the Fc fragment, has a 
highly conserved structure even between different isotypes [17-20]. It has antibody effector functions 
such as antibody-dependent cellular cytotoxicity (ADCC) and complement-dependent cytotoxicity 
(CDC) through its interaction either with lymphocyte receptors (Fc receptors, FcγRs) on effector cells
such as natural killer cells or with the C1q component of complement. The Fc fragment is a dimer and
displays a horseshoe-like arrangement of two antiparallel β-sandwich domains, named CH2 and CH3,
connected by a short flexible linker (Figure 2a). Interaction between subunits is mainly through the
pair of CH3 domains. These two domains form a compact, non-covalent dimer with a buried surface
area of ~1,000 Å². A biantennary complex-type glycan is attached onto each Asn297 in the CH2
domains [17]. Characteristics of Fc N-glycosylation include a low incidence of monosialylation, no
disialylation, little bisecting GlcNAc, a high incidence of core fucose and heterogeneity of galactose 
residues [21]. The particular glycoform of the Fc fragment impacts on its physiological function. 
Removal of fucose is known to enhance ADCC activity both in vitro [22] and in vivo [23]. Moreover, 
the galactose content of human IgG-Fc correlates inversely with disease progression in rheumatoid 
arthritis and other auto-immune diseases [24]. The anti-inflammatory activity of intravenous Ig (IVIG) 
can be recapitulated with a fully recombinant preparation of appropriately sialylated IgG Fc 
fragments [25]. Thus, manipulation of Asn297 glycan structures has emerged as a strategy to modulate 
effector functions of therapeutic antibodies [26,27]. 

Pioneering X-ray crystallographic studies of the isolated human IgG1 Fc domain [17] and rabbit
IgG [28] have shown that the two conserved N-glycan chains of the Fc are well defined, with the
oligosaccharide bound to the surfaces of the CH2 domains. Since then, our structural knowledge of Fc
fragments and their glycans has grown substantially. As of March 2012, there are more than 30 PDB
entries. The structures containing glycosylated Fc fragments are summarized in Table 1. A representative
structure of human IgG1 Fc and its N-glycan is shown in Figure 2b. The α1-6 arm of the N-glycan of
Asn297 contacts two hydrophobic residues, Phe241 and Phe243, and makes several hydrogen bonds
with Lys246, Asp265, and Arg301 (Figure 2b). A core Fuc is located in the vicinity of Tyr296 and
indirectly affects the hydration mode of Tyr296 [29]. In contrast, Tyr313, which corresponds to
Tyr296 in human IgG1, makes direct hydrogen bonds with a core Fuc residue in the mouse
IgG2a structure [19].

Figure 2. (a) Overall structure of immunoglobulin G (PDB code; 1igt) is shown in a ribbon
model. One light and two heavy chains are shown in beige, blue and cyan, respectively.
Carbohydrate residues attached on the Fc region are shown in sphere models. (b) Close-up
view of Asn297 attached glycan of human IgG1 Fc (PDB code; 2dts). Carbohydrate moiety
and amino acid residues which interact with N-glycan are shown in the rod model.
Hydrogen bonds between protein and carbohydrate are shown as red dotted lines.

Table 1. Summary of Fc fragment structures.

PDB ID    Glycan structure *            Resolution (Å)   Reference

1h3u      M3GN2F
1h3x      GN2M3GN2F
1h3v      G2GN2M3GN2F
1h3y      G2GN2M3GN2F
Human IgG1 Fc produced by CHO or Fut8-/- CHO cells
          GN1M3GN2 (chain-A)
          GN2M3GN2 (chain-B)
Human IgG1 Fc fragment triple mutant (M252Y/S254F/F256E)
Human IgG1 Fc fragment triple mutant (L234F/L235E/P331S)
3c2s      G1GN2M3GN2F
Protein is produced by HEK293 cells.
Human IgG1 Fc fragment triple mutant (S239D/A330L/I332E)
2ql1      G1GN2M3GN2F                   2.5              [35]
Human IgG1 Fc fragment + 13 residues peptide
          G1GN1M3GN2F (chain-A)         2.7              [36]
          G1GN2M3GN2F (chain-B)
Human IgG1 (Rituxan) Fc fragment + Staphylococcus aureus Protein A domain B
1l6x      G2GN2M3GN2F                   1.65             [37]
Human IgG1 Fc fragment
3do3      G1GN2M3GN2F                   2.5              [38]
Human IgG1
1hzh      G2GN2M3GN2F (chain-H)         2.7              [39]
          G1GN2M3GN2F (chain-K)
Mouse IgG2a
1igt      G1GN2M3GN2F                   2.8              [19]
Human IgG1 Fc fragment + Protein A mimetic peptide dendrimer
3d6g      GN2M3GN2F                     2.30             [40]
Mouse IgG2b Fc fragment
2rgs      GN2M3GN2F                     2.1              [41]
Rabbit IgG Fc fragment
2vuo      G1GN2M3GN2                    1.95             [42]
Human IgG1 Fc fragment + minimized protein A
1oqo      GN2M3GN2F                     2.3              [43]
1oqx      M3GN2F                        2.6
Human IgG Fc fragment
1fc1      G1GN2M3GN2F                   2.9              [17]
Rat IgG2a Fc fragment
1i1c      GN2M3GN2F                     2.7              [18]
Human IgG1 Fc fragment + human Fc receptor (FcγRIIIb)
          G1GN1M3GN2F (chain-A)         3.0              [44]
          GN2M3GN2F (chain-B)
          G1GN1M3GN2F (chain-A)         3.5
          GN2M3GN2F (chain-B)
Human IgG1 Fc fragment + human Fc receptor (FcγRIIIb)
1e4k      G1GN2M3GN2F                   3.2              [45]
Human IgG1 Fc fragment + human Fc receptor (FcγRIIIa)
3sgj      GN2M3GN2F                     2.2              [46]
3sgk      GN3M3GN2                      2.4
Human IgG1 Fc fragment + human Fc receptor (FcγRIIIa)
3ay4      G1GN2M3GN2                    2.2              [47]
Human heterodimeric Fc + human neonatal FcR (FcRn)
1i1a      GN2M3GN2F                     2.8              [18]

* Glycan structures deposited in each PDB coordinate file are shown. G: D-galactose,
GN: N-acetyl-D-glucosamine, M: D-mannose, F: L-fucose.
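
The shorthand used in the glycan column can be expanded into a monosaccharide composition mechanically. The sketch below is only an illustration of how to read the notation: the function `glycan_composition` is hypothetical, and it assumes that each symbol (G, GN, M, F) is optionally followed by a count, with a missing count meaning one residue.

```python
import re
from collections import Counter

# Symbols from the Table 1 footnote; "GN" must be tried before "G".
NAMES = {"G": "Gal", "GN": "GlcNAc", "M": "Man", "F": "Fuc"}
TOKEN = re.compile(r"(GN|G|M|F)(\d*)")
WHOLE = re.compile(r"(?:(?:GN|G|M|F)\d*)+")


def glycan_composition(shorthand: str) -> dict[str, int]:
    """Count residues in a shorthand string such as 'G2GN2M3GN2F'."""
    text = shorthand.replace(" ", "")
    if not WHOLE.fullmatch(text):
        raise ValueError(f"unrecognized glycan shorthand: {shorthand!r}")
    counts: Counter[str] = Counter()
    for symbol, number in TOKEN.findall(text):
        # A symbol without a number stands for a single residue (assumption).
        counts[NAMES[symbol]] += int(number) if number else 1
    return dict(counts)


if __name__ == "__main__":
    print(glycan_composition("G2GN2M3GN2F"))
```

Read this way, G2GN2M3GN2F corresponds to a fully galactosylated, core-fucosylated biantennary glycan with two Gal, four GlcNAc, three Man and one Fuc residue.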

Previous comparative conformational analysis has demonstrated that both glycosylated and
non-glycosylated Asn side chains exhibit χ1 (N-Cα-Cβ-Cγ) torsion angles of -60°, 60°, and 180°,
corresponding to the g-, g+ and t conformers, respectively. The χ2 (Cα-Cβ-Cγ-Nδ) torsion angle
shows a wide range centered at about 0°, but is nevertheless limited by glycosylation. The g-
conformer is the most preferred and the g+ conformer the rarest in both
glycosylated and non-glycosylated Asn residues [9,48]. The conformations of 56 Fc N-glycans from
the 29 different Fc structures listed in Table 1 are statistically analyzed in Figure 3. The side chain
torsion angles, χ1 and χ2, of Asn297 are plotted in Figure 3a. In the case of Fc structures, the χ1
torsion angle of Asn297 exhibits a marked preference for -60°, corresponding to the g- conformer.
The torsion angles, φ and ψ, for all glycosidic linkages are plotted in Figure 3b. All the dihedral angles
of the glycosidic linkages are within an acceptable region compared with other glycoproteins [48]. The
dihedral angle distribution of the α1-6 arms appears to be more restricted than that of the α1-3 arms
(Man-4-Man-3, Man-4'-Man-3, GlcNAc-5-Man-4, and GlcNAc-5'-Man-4' in Figure 3b). This
difference derives from the fact that the α1-6 arms of the glycan interact with the first two strands of
the CH2 domains. Solution NMR analyses of the Fc indicate that the α1-6 arm is completely
immobilized through interactions with the protein surface, whereas the α1-3 mannose terminus is more
dynamic [49,50]. On the other hand, spin relaxation NMR studies indicate that Fc glycan could exhibit
a dynamic conformational state as well as the fixed state as seen in the crystal structure [51].
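
For readers who want to reproduce such torsion angles from deposited coordinates, the following is a minimal sketch of a dihedral calculation from four atom positions (for χ1 these would be N, Cα, Cβ and Cγ). It is not the procedure used by MolProbity or CARP; the function names and the test coordinates are hypothetical, and `nearest_staggered` simply reports which of the three staggered values (-60°, 60°, 180°) an angle lies closest to.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Torsion angle p0-p1-p2-p3 in degrees, in the range (-180, 180]."""
    b0, b1, b2 = _sub(p0, p1), _sub(p2, p1), _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = (b1[0] / norm, b1[1] / norm, b1[2] / norm)
    # Project the outer bonds onto the plane perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * c for c in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * c for c in b1))
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))


def nearest_staggered(angle_deg):
    """Return -60, 60 or 180, whichever is closest on the circle."""
    def gap(center):
        return abs((angle_deg - center + 180.0) % 360.0 - 180.0)
    return min((-60, 60, 180), key=gap)


if __name__ == "__main__":
    # Hypothetical coordinates chosen to give a right-angle torsion.
    print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
    print(nearest_staggered(-72.5))
```

Comparing angles on the circle rather than on a line matters near the ±180° boundary, where, for example, -175° must be assigned to the 180° bin.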

Figure 3. (a) The side-chain torsions of Asn297 of the Fc fragment. Torsion angles of the
Asn297 side chain are measured by MolProbity [52]. We excluded the Fc structure with
PDB code 2wah from this inspection, since this structure contains
high-mannose type glycans. (b) Comparison of glycosidic torsions of N-glycan attached on
Asn297 of Fc fragment. Dihedral angles of each linkage are calculated with CARP [15].
The axes indicate the φ and ψ angles. The residues with
errors are carefully excluded from this analysis. In many cases, a β1-6 linkage is
erroneously used between core Fuc and GlcNAc instead of an α1-6 bond [53]. Eight entries
are plotted in Fuc-GlcNAc-1 (PDB code; 1h3w, 3ave, 3d6g, 2rgs, chain-A in 1e4k, and
chain-B in 3sgj).

2.1.2. Glycoform Affects the Relative Interdomain Angles of the Fc Fragment

Glycan structure can potentially affect the overall structure of a glycoprotein. The influence of 
glycoform on the conformation of the Fc fragment has been extensively investigated. Two papers 
report on the relationship between glycoform and the interdomain angles of the CH2-CH3 domains.

In the first report, the influence of glycoform on the structure and function of IgG Fc was assessed
by sequential exo-glycosidase treatment [31]. Krapp et al. solved the crystal structures of human IgG1
Fc of four glycoforms bearing consecutively truncated oligosaccharides (PDB code; 1h3t, 1h3u, 1h3x,
1h3v and 1h3w). Removal of the terminal GlcNAc as well as the mannose residues causes the largest
conformational change in both the oligosaccharide and in the polypeptide loop containing the
N-glycosylation site. The conformational change in the CH2 domain affects the interface between the
IgG-Fc fragments and the FcγRs. Moreover, removal of the sugar residues permits the mutual
approach of the CH2 domains and the generation of a closed conformation. This contrasts with the
open conformation of fully galactosylated IgG Fc, which may be optimal for FcγR binding. Solution
NMR analysis of a series of Fc glycoforms also indicates that the carbohydrate moieties are required
for maintaining the structural integrity of the FcγR binding site [54].

In the second report, Crispin and co-workers solved the crystal structure of a recombinant human
IgG1 Fc fragment designed to possess high-mannose type glycan at Asn297 [30]. Recombinant Fc was
transiently expressed using HEK 293T cells in the presence of the α-mannosidase I inhibitor,
kifunensine (Figure 1b). The glycan structure was confirmed to be Man9GlcNAc2 by MALDI-TOF-MS
and the crystal structure of the glycoform was solved. The electron density map of the high-mannose
glycan is quite asymmetric. Extensive branched density is observed in chain A, assigned as
Man7GlcNAc2. On the other hand, poor electron density is found in chain B, corresponding to three
reducing terminal saccharide residues (Man1GlcNAc2). The overall structure of the high-mannose
glycan is similar to that of a complex type. Figure 4a shows a structural comparison between
high-mannose type (chain A of 2wah) and complex-type glycans (PDB code; 2dts). The χ1 and χ2
angles of the side chain of Asn297 are essentially identical to those of other Fc fragments possessing
complex-type glycans. Moreover, the dihedral angles of the glycosidic linkages in the pentasaccharide
Man3GlcNAc2 core of both glycans are also similar (Figure 4a). However, the α1-6 branches (D2 and
D3 arms) of the high-mannose type occupy different positions compared with complex-type glycans.
This is due to the difference between the α1-6 and β1-2 glycosidic bonds. Structural superposition of
two Fc fragments reveals that the attachment of high-mannose type N-glycans opens the interdomain
cavity between the CH2 domains (Figure 4b). This structural difference between the two glycoforms
might explain the evidence that recombinant monoclonal antibodies with human oligomannose-type
glycans display enhanced ADCC, together with reduced complement activation through C1q
binding [55]. It should be noted that the expression construct of the recombinant Fc fragment lacks the
cysteine residues which form a disulfide bond in the hinge region. Further studies are needed to fully
reveal the relationship between glycoform and structure.

Figure 4. (a) Structural comparison between high-mannose glycan (PDB code; 2wah,
green) and complex-type glycan (PDB code; 2dts, cyan). (b) Structural superposition
between high-mannose type Fc fragment (green) and complex-type Fc fragment (cyan).
Protein molecules and carbohydrate chains are shown in wire and stick models, respectively.
The positions of Asn297 are indicated by red asterisks. Structural superposition of crystal
structures was performed with the program SUPERPOSE [56].




2.1.3. Intra-molecular Carbohydrate-Carbohydrate Interaction 

Direct CH2-CH2 interaction is accomplished mainly through the pair of N-glycans at Asn297. To
analyze the intramolecular carbohydrate-carbohydrate interaction modes of the Fc fragment, we compared
10 uncomplexed Fc fragment structures from Table 1: wild type human IgG1 Fc (PDB code; 1fc1,
1h3v, 1h3y, 2dts, 3ave, and 3do3), mutated human IgG1 Fc (PDB code; 3fjt), rat IgG2a Fc (PDB code;
1i1c), mouse IgG2b Fc (PDB code; 2rgs), and rabbit IgG Fc (PDB code; 2vuo). Structural
superimposition of these 10 Fc structures reveals that the interdomain angles of the CH2-CH3
domains are highly variable (Figure 5a). The highly mobile domain angles result in a variety of
carbohydrate-carbohydrate interaction modes as described below. The modes are classified into six
types (i-vi), and a schematic representation is shown in Figure 5b. (i) In four of 10 Fc fragments (PDB
code; 2dts, 3fjt, 3ave, and 3do3), only one hydrogen bond is found, between the OH4 hydroxyl groups
of the two α1-3 linked Man residues (Man-4) (Figure 5c). (ii) In the first reported human IgG1 Fc
fragment structure (PDB code; 1fc1), the carbohydrate-carbohydrate interaction mode is slightly
different. The OH4 hydroxyl group of the α1-3 linked Man (Man-4) asymmetrically interacts with both
OH3 and OH4 of the counterpart α1-3 linked Man (Man-4) (Figure 5d). (iii) In two Fc fragment
structures (PDB code; 1h3v and 2vuo), the distances between the two OH4 hydroxyl
groups of the two α1-3 linked Man (Man-4) residues are rather long for a direct hydrogen bond (Figure
5e). In (i)-(iii), the Fc structures superimpose well, whereas the remaining three Fc structures (PDB
code; 1h3y, 1i1c, and 2rgs) are unique. (iv) The structure of the human IgG1 Fc fragment (PDB code;
1h3y) is completely different compared with other human wild type IgG1 Fc fragments (see magenta
in Figure 5a). In this structure, one weak hydrogen bond is observed between O7 of GlcNAc in the
α1-3 arm (GlcNAc-5) and the OH6 hydroxyl group of the α1-6 linked Man (Man-4') at a distance of 3.2 Å
(Figure 5f). The human IgG1 Fc fragment (1h3y) was crystallized in a high salt concentration (2.0 M
NaCl) while the other human wild type IgG1 Fc fragments were crystallized under lower ionic strength
conditions (1fc1; 30 mM sodium chloride, 1h3v and 2dts; distilled water, 3ave; 20% butanediol, and
3do3; 0.2 M sodium chloride and 20% polyethylene glycol 3,350). The range of ionic strength
conditions may contribute to the conformational differences. (v) The interaction mode observed in rat
IgG2a Fc (PDB code; 1i1c) is asymmetric. The α1-3 arm of one side contacts the core Fuc and the
chitobiose of the other (Figure 5g). The O5 and OH6 of GlcNAc in the α1-3 arm (GlcNAc-5) interact
with OH4 of the Fuc residue, whereas OH4 of the α1-3 linked Man (Man-4) makes hydrogen bonds
with the hydroxyl groups of the chitobiose. (vi) In the crystal structure of mouse IgG2b Fc fragment
(PDB code; 2rgs, [40]), four symmetric hydrogen bonds are found between the two oligosaccharide
chains. The OH3 of the α1-3 linked Man (Man-4) makes hydrogen bonds with OH2 of its counterpart
β-Man (Man-3), and likewise the OH4 with OH6 (Figure 5h).

The interdomain angle of the CH2-CH3 domains could be affected by the crystallization conditions,
especially ionic strength [31]. Crystallographic analysis only provides static snapshots and gives no 
direct information on the dynamics of a glycosidic linkage or of polypeptide fluctuations. Thus, it is 
likely that the single static linkage conformation observed in a crystal does not directly reflect the 
average solution conformation. Nonetheless, the average conformation for a given linkage within a 
large set of static structures is likely to correspond well to the average solution conformation, and the 
distribution of static structures will give an indication of the flexibility of the linkage, as long as crystal 
packing forces do not impose systematic changes. 
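
Averaging torsion angles over a set of static structures requires some care, because angles are periodic and an arithmetic mean fails near ±180°. The sketch below illustrates one common way to do this with circular statistics. It is not the analysis performed in this review: the function names are hypothetical, and the mean resultant length is used here only as a simple indicator of spread (values near 1 mean tightly clustered angles, values near 0 mean widely scattered ones).

```python
import math


def _vector_sum(angles_deg):
    """Sum the unit vectors that correspond to each angle."""
    s = sum(math.sin(math.radians(a)) for a in angles_deg)
    c = sum(math.cos(math.radians(a)) for a in angles_deg)
    return s, c


def circular_mean(angles_deg):
    """Mean direction in degrees, in the range [-180, 180]."""
    s, c = _vector_sum(angles_deg)
    return math.degrees(math.atan2(s, c))


def resultant_length(angles_deg):
    """Mean resultant length R (1 = identical angles, 0 = no preferred direction)."""
    s, c = _vector_sum(angles_deg)
    return math.hypot(s, c) / len(angles_deg)


if __name__ == "__main__":
    observed = [-70.0, -60.0, -50.0]  # hypothetical torsion angles from three structures
    print(circular_mean(observed), resultant_length(observed))
```

For two angles such as 170° and -170°, the arithmetic mean would be 0°, whereas the circular mean correctly falls at the ±180° boundary.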

Figure 5. (a) Structural superposition of 10 Fc fragment structures (1fc1; green, 1h3v;
cyan, 1h3y; magenta, 1i1c; yellow, 2dts; pink, 2rgs; wheat, 2vuo; slate, 3ave;
orange, 3do3; lime, 3fjt; deep teal). For four entries (PDB code; 1h3w, 3c2s, 2ql1, and
1l6x), the asymmetric units of these crystals contain only one heavy chain. Thus, the
symmetry-related neighboring heavy chains were generated for this analysis.
Structural superposition was performed by SUPERPOSE. (b) Schematic representation of
six types of carbohydrate-carbohydrate interaction modes. N-glycans of two chains are
shown in blue and pink. Hydrogen bonds are shown as red lines. (c)-(h) Close-up views of
the interfaces of carbohydrate-carbohydrate interactions. Carbohydrate moiety is shown in
the rod model. Hydrogen bonds between carbohydrates are shown as red dotted lines.
The structural superimposition of four structures which have only one
carbohydrate-carbohydrate interaction is shown in (c). Human IgG1 Fc fragment (PDB
code; 1fc1) is shown in (d). The superimposition of two structures which have no
interaction between glycans is shown in (e). Human IgG1 Fc in high salt condition (PDB
code; 1h3y), rat IgG2a (PDB code; 1i1c), and mouse IgG2b Fc fragment (PDB code; 2rgs)
are shown in (f), (g), and (h), respectively.

2.1.4. Carbohydrate-Assisted Intermolecular Interaction (Neonatal Fc Receptor and Fcγ Receptor IIIa)

Carbohydrates attached to glycoproteins are often required for tight binding to partner proteins. 
In this section, two topics are introduced in which N-linked carbohydrates on receptors contribute to
full Fc binding activity. 

The neonatal Fc receptor (FcRn) transports maternal immunoglobulin G (IgG) across the neonatal 
intestine in rodents and across the placenta in humans, thereby conferring humoral immunity to the 
fetus or newborn against antigens encountered by the mother. FcRn binds IgG with nanomolar affinity 
at acidic pH (<6.5) in the intracellular transport vesicles and releases IgG upon encountering the basic 
pH of the bloodstream (7.4). FcRn is a heterodimer composed of a soluble light chain,
β2-microglobulin, and a membrane-bound heavy chain that includes three extracellular domains
(α1-3), a single pass transmembrane domain, and a short cytoplasmic domain. FcRn interacts with the
CH2-CH3 domain interface on each chain of the Fc homodimer [57]. The 2:2 interaction mode creates
higher ordered structures, called "oligomeric ribbons", and prohibits the growth of well-ordered
co-crystals. To improve the crystal quality, Martin and co-workers designed a heterodimeric version of
Fc that cannot bridge between FcRn molecules, since it contains only a single FcRn binding site [58],
and solved the crystal structure of an FcRn ectodomain in complex with the heterodimeric Fc fragment
at 2.8-Å resolution [18]. The heavy chain (α2) and β2-microglobulin domains of FcRn interact with the
Fc CH2-CH3 interface (Figure 6a). The binding interface between FcRn and Fc spans a large surface
area (buried surface area up to 1,870 Å²) and is highly complementary. The complex is stabilized by
extensive hydrophobic and electrostatic interactions. Interestingly, the complex-type N-linked glycan
attached on Asn128 of FcRn has a highly ordered structure and contributes 10-15% of the total buried
surface area in the interface. The core Fuc residue and GlcNAc of the α1-3 arm (GlcNAc-5) tightly
interact with the complementary surface of the binding partner (Figure 6b). This suggests that
complex-type glycan, but not high-mannose type glycan, on FcRn is required for maximal Fc binding
affinity. Indeed, differential glycosylation of mouse FcRn could affect the receptor/ligand
stoichiometry under non-equilibrium conditions [59].
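
To put the glycan contribution in absolute terms, the quoted percentages can be converted into an area range. This is a back-of-the-envelope sketch only: it assumes that the 10-15% figure applies to the upper value of 1,870 Å² quoted above, and the function name `glycan_buried_area` is hypothetical.

```python
def glycan_buried_area(total_area, percent_low, percent_high):
    """Convert a percentage range of a buried surface area into square angstroms."""
    return (total_area * percent_low / 100, total_area * percent_high / 100)


if __name__ == "__main__":
    # Assumed inputs: 1,870 square angstroms total, 10-15% from the Asn128 glycan.
    low, high = glycan_buried_area(1870, 10, 15)
    print(f"Glycan contribution: {low:.0f}-{high:.0f} square angstroms")
```

Under that assumption, the Asn128 glycan would bury roughly 190-280 Å² of the FcRn-Fc interface.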

The crystal structure of the Fc-FcRn complex reveals how the N-glycan on FcRn affects the binding
affinity. In contrast, the N-glycan of IgG Fc strongly influences the interaction between IgG Fc and
FcγR, and therapeutic antibodies may be modulated by selecting the appropriate glycoform. For
example, removal of Fc core fucose selectively and significantly increases binding affinity to FcγRIII,
thereby leading to enhanced cellular immune effector functions, such as ADCC. This may be
especially relevant with respect to therapeutic anticancer antibodies [60,61], and has been the focus of
research during the past decade. A crystal structure of glycosylated human IgG1 Fc fragment in
complex with the unglycosylated extracellular domain of FcγRIII was first reported by Sondermann
and co-workers [45]. One FcγRIII binds to the two halves of the Fc fragment and contacts residues in
the CH2 domains and the hinge region. Complex formation significantly increases the angle between
the two soluble FcγRIII domains and the Fc fragment is asymmetrically open. Recently, crystal
structures of afucosylated human Fc in complex with glycosylated FcγRIIIa ectodomain were solved
by two independent groups (Figure 6c) [46,47]. Although both groups produced the glycosylated
FcγRIIIa ectodomain using mammalian expression systems, the glycan structures attached on Asn162
differ. Ferrara and colleagues report that the N-glycan structure is high-mannose type, while
Mizushima et al. find asialylated complex type. The overall fold of the Fc-FcγRIIIa complexes where
both proteins are glycosylated is very similar to that of the complexes where only the Fc protein is
glycosylated. Clear electron density was obtained for both the Asn162-linked glycan of the receptor
and the glycans linked to the Fc fragment. The carbohydrate attached on Asn162 shares a large
interaction surface area (approximately 12% of the total interface area, 145 Å², in the case of PDB
code; 3ay4) with the Fc formed by various polar, van der Waals, and hydrogen bond interactions.
The receptor Asn162-carbohydrate interactions center on the Asn297-carbohydrate core of Fc chain A
and its immediate vicinity (Figure 6d). Overall, a combination of direct or water-mediated
carbohydrate-carbohydrate and carbohydrate-protein contacts are observed as part of the newly formed 
interaction between afucosylated Fc and the Asnl62-glycosylated receptor. 
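
The percentages quoted for the two receptor glycans can be related to absolute areas with simple arithmetic. The following is a minimal Python sketch, not part of any crystallographic package: the function names are illustrative, and it assumes that the 145 Å² quoted for PDB code; 3ay4 is the share of the interface contributed by the Asn162 glycan.

```python
def glycan_area_from_fraction(total_interface_area, fraction):
    """Area (in Å^2) attributed to the glycan, given the total area and its fractional share."""
    return total_interface_area * fraction


def total_area_from_glycan(glycan_area, fraction):
    """Total interface area (in Å^2) implied by a glycan area and its fractional share."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    return glycan_area / fraction


# FcRn-Fc complex: the Asn128 glycan contributes 10-15% of up to 1,870 Å^2.
fcrn_low = glycan_area_from_fraction(1870, 0.10)
fcrn_high = glycan_area_from_fraction(1870, 0.15)

# Fc-FcγRIIIa complex: 145 Å^2 taken as approximately 12% of the interface (assumption).
fcgr_total = total_area_from_glycan(145, 0.12)

print(f"FcRn glycan share: {fcrn_low:.0f}-{fcrn_high:.0f} Å^2")
print(f"Implied Fc-FcγRIIIa interface: {fcgr_total:.0f} Å^2")
```

Under these assumptions the FcRn glycan accounts for roughly 190-280 Å² of buried surface, and the Fc-FcγRIIIa interface comes to roughly 1,200 Å². The two glycan contributions are therefore of broadly similar size.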

Ferrara and colleagues also solved the crystal structure of fucosylated Fc in complex with the 
glycosylated FcγRIIIa ectodomain. The core fucose linked to Fc is oriented towards the second 
GlcNAc (GlcNAc-2) of the chitobiose connected to Asn162 of FcγRIIIa and has to be accommodated 
in the interface between the interacting glycan chains. This steric rearrangement causes the movement 
of the whole oligosaccharide attached to Asn162 up to a maximum distance of 2.6 Å, while almost no 
movement is observed in the case of afucosylated Fc. This rearrangement of the interaction network 
reduces the enthalpy contribution in the fucosylated Fc complex. It is noteworthy that even such subtle 
displacement of carbohydrate chains affects physiological activity, such as in ADCC [46]. 

Figure 6. (a) Overall structure of neonatal Fc receptor (FcRn) in complex with heterodimeric 
Fc (hdFc) (PDB code; 1i1a). Heavy chain and soluble light chain β2-microglobulin (β2m) 
of FcRn are shown in slate and cyan, respectively. Proximal and distal Fc fragments of 
hdFc are shown in pink and white, respectively. The region delineated in black dotted lines 
is magnified in (b). (b) Close-up view of FcRn-hdFc complex. N-glycan attached at 
Asn128 of FcRn is shown in rod and semitransparent sphere model. (c) Overall structure of 
human Fc-glycosylated human Fcγ receptor IIIa (FcγRIIIa) complex (PDB code; 3sgk). 
Two chains of Fc fragment and FcγRIIIa are shown in green, cyan, and yellow, 
respectively. The region delineated in black dotted lines is magnified in (d). (d) Close-up 
view of carbohydrate-carbohydrate interaction in Fc-FcγRIIIa. Hydrogen bonds are shown 
as red dotted lines. 




2.2. High-Mannose Type Glycan on Group 2 Influenza Virus Neuraminidase 

Influenza virus infection has been a major threat to public health throughout the world for centuries. 
Influenza types A and B are enveloped RNA viruses carrying two glycoproteins on their surface, 
hemagglutinin (HA) and neuraminidase (NA, acylneuraminyl hydrolase, EC 3.2.1.18). Influenza NA 
removes terminal α2-3 or α2-6 linked sialic acid residues from carbohydrate moieties on cell surface 
glycoconjugates and is thought to thereby facilitate virus release and infection of another cell. 
Inhibition of NA delays the release of progeny virions from the surface of infected cells [62], 
suppressing the viral population, thus allowing time for the host immune system to eliminate the virus. 

Antigenic differences are used to classify influenza type A viruses into nine NA (N1-N9) 
subtypes [63]. Phylogenetically, there are two groups of NAs: group 1 contains N1, N4, N5 and N8, 
and group 2 contains N2, N3, N6, N7 and N9 [64]. 

In both influenza A and B, functional NA is a tetramer of identical subunits with four-fold 
rotational symmetry. The NA tetramer forms a box-like head on top of a long stalk domain and is 
anchored in the viral membrane by a hydrophobic sequence near the N-terminus [65,66]. The surface 
of an influenza virus typically has about 50 tetrameric NA spikes [67]. NA has been targeted in 
structure-based enzyme inhibitor design programs that have resulted in the production of two drugs, 
zanamivir (Relenza) [68] and oseltamivir (Tamiflu) [69], that mimic the transition state of the normal 
enzyme reaction. 

Figure 7. High-mannose type glycan of influenza neuraminidase assists tetramer 
formation. (a) Overall structure of monomeric influenza N2 neuraminidase 
(PDB code; 1nn2). Protein, carbohydrate, and calcium ion are shown in ribbon, stick, and 
sphere models, respectively. (b) Tetrameric structure of influenza N2 neuraminidase (PDB 
code; 1nn2). N-linked glycans at Asn200 are shown in sphere models. The region 
delineated in black dotted lines is magnified in (c). (c) Close-up view of N-glycan at 
Asn200 and symmetry-related molecule. Hydrogen bonds are shown in red dotted lines. (d) 
The side-chain torsion angles of Asn200 of N2, Asn207 of N6, and Asn200 of N9 NA 
(Asn201 in PDB code; 2b8h). (e) Amino acid sequence alignment of group 1 and 2 
influenza neuraminidase around Asn200 glycosylation sites. Putative N-linked 
glycosylation sites in group 2 are highlighted. 







Crystallographic analysis of influenza NA has a history of almost 30 years. The first crystal 
structure of influenza N2 NA was reported in 1983 [70], and the first N9 in 1987 [71]. The NA monomer 
consists of a symmetric six-bladed β-propeller arrangement stabilized in part by calcium ions bound on 
the symmetry axis (Figure 7a). There are five N-glycosylation sequons exposed on the protein surface, 
namely Asn86, Asn146, Asn200, Asn234, and Asn402, the first four of which are assigned in the 
model. The two N-glycans at Asn146 and Asn200 are highly ordered. The glycan at Asn146 is of the 
core-fucosylated complex type, whereas that at Asn200 is of the high-mannose type. The catalytic 
active site is in the middle of the β-propeller. In the tetrameric enzyme, each active site is directed 
sideways rather than upwards, an orientation consistent with the enzyme having to cleave off sialic 
acid from nearby membrane proteins to avoid virus trapping (Figure 7b). The tetramer is approximately 
100 × 100 × 60 Å in size with a large hole underneath around the 4-fold axis. The high-mannose type 
glycan at Asn200 is located in the rim of blade 2 and bridges the right-handed neighboring subunit, 
contributing to the intersubunit interactions in the tetramer (Figure 7b). A close-up view of the N-glycan 
attached at Asn200 and neighboring molecules is shown in Figure 7c. The N-glycan lies over blade 5 
of a neighboring molecule. A β-Man (Man-3) residue of this glycan is buried in the neighboring 
molecule. The accessible surface area of this residue is calculated as ~19 Å² by AREAIMOL [72]. The 
crystal structures of N2, N6 and N9 NA in group 2 have been determined and are summarized in 
Table 2. The N-glycan attached at Asn200 in N2 corresponds to those of Asn207 in N6 and Asn200 in 
N9, and their structures are well superimposed on each other. The distribution of the torsion angles of 
the side chain of Asn200 is shown in Figure 7d. The torsion angles mainly assume -240°, which is 
normally rare in both glycosylated and non-glycosylated Asn side chain conformers [48] (Figure 7d). 
Strong interaction between the N-glycan and the neighboring subunit may stabilize such a conformation 
of Asn200. Amino acid sequence alignment reveals the sequon at this site to be conserved among 
group 2 NAs except for N3 (Figure 7e). Structural comparison between group 2 and other NAs (N1, 
N4, N8 and type B NAs) reveals no obvious common structural feature [64,73]. Viral proteins are 
synthesized and secreted by host cells. Thus, the glycans attached to viral proteins are also processed 
by host glycosyl-hydrolases and -transferases. Since the glycans at Asn200 are of the 
high-mannose type, group 2 NA likely forms a tetrameric structure before glycan processing. 
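
Side-chain torsion angles such as those plotted for Asn200 are computed from the coordinates of four consecutive atoms. The sketch below is a generic illustration in plain Python and is not the procedure used for Figure 7d. The function names are hypothetical, and the atom choice mentioned in the comments (N, CA, CB, CG for χ1; CA, CB, CG, OD1 for χ2) follows the usual convention for asparagine.

```python
import math


def dihedral_deg(p1, p2, p3, p4):
    """Signed torsion angle in degrees (range -180 to 180) defined by four 3D points."""
    def sub(a, b):
        return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

    def cross(a, b):
        return (a[1] * b[2] - a[2] * b[1],
                a[2] * b[0] - a[0] * b[2],
                a[0] * b[1] - a[1] * b[0])

    def dot(a, b):
        return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

    # Bond vectors along the chain p1 -> p2 -> p3 -> p4
    b1, b2, b3 = sub(p2, p1), sub(p3, p2), sub(p4, p3)
    # Normals of the planes (p1, p2, p3) and (p2, p3, p4)
    n1, n2 = cross(b1, b2), cross(b2, b3)
    # atan2 form: gives the sign directly and avoids acos near 0 and 180 degrees
    y = math.sqrt(dot(b2, b2)) * dot(b1, n2)
    x = dot(n1, n2)
    return math.degrees(math.atan2(y, x))


def to_0_360(angle_deg):
    """Map an angle from the -180..180 convention onto the 0..360 convention."""
    return angle_deg % 360.0


# Toy coordinates (not taken from any PDB entry): a 90 degree torsion.
# For Asn chi1 the four points would be the N, CA, CB and CG atoms;
# for chi2 they would be CA, CB, CG and OD1.
chi_example = dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
```

Because torsion angles are periodic, values reported on a 0-360° scale and on a -180 to 180° scale describe the same geometry. `to_0_360` converts between the two when comparing angles from different sources.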



Table 2. Summary of group 2 influenza neuraminidase structure. 

PDB ID    Glycan structure *    Resolution (Å)    Reference

N2 (A/Tokyo/3/1967)
1nn2      M4GN2                 2.20              [74]

N2 (A/Tokyo/3/1967)
1inw      M3GN2                 2.40              [75]
1inx      M3GN2                 2.40

N6 (A/swine/KU/2/2001)
1v0z      M3-5GN2               1.84              [76]
1w1x      M1-5GN2               2.00
1w20      M1-6GN2               2.08
1w21      M1-6GN2               2.08
2cml      M6GN2                 2.15

N9 (A/Tern/Australia/G70C/75)
1iny      M5GN2                 2.40              [75]

N9 (A/Tern/Australia/G70C/1975 (H11N9))
1f8b      M5GN2                 1.80              [77]
1f8c      M5GN2                 1.70
1f8d      M5GN2                 1.40
1f8e      M5GN2                 1.40

N9 (A/Tern/Australia/G70C/1975) in complex with single chain Fv fragment
1a14      M5GN2                 2.50              [78]

N9 (A/NWS/whale/Maine/1/84)
2b8h      M7-8GN2               2.20              [79]

* Glycan structures deposited in each PDB coordinate file are shown. M: D-mannose, GN: 
N-acetyl-D-glucosamine. 

3. Immature High-Mannose Type Glycans Contribute to Inter-Subunit and Inter-Domain 
Interactions 

Initial processing of glycoproteins takes the form of deglucosylation of high-mannose type glycan 
in the ER, and is conserved among all eukaryotes [80]. It is generally considered to be the major event 
to signal the completion of protein folding. Mono- or di-glucosylated N-glycans are rarely observed in 
mature glycoproteins. However, glucosylated N-glycans have been detected in several secreted proteins. 
Here, we introduce two structures which possess immature glucosylated high-mannose type N-glycans 
on their surfaces. In both cases, the immature glycan extensively interacts with the surface of the protein. 
Investigation of the co-existence of both unprocessed and processed glycans on a single polypeptide 
may help to unravel the relationship between protein folding and glycan maturation. 

3.1. Monoglucosylated High-Mannose Type Glycan Stabilizes Hexamer Formation of Arylphorin from 
Antheraea pernyi 

Our first example is arylphorin from the Chinese oak silkworm, Antheraea pernyi (abbreviated as 
APA hereafter), which is a hexameric protein of 688 amino acid residues per subunit [81]. It is a 
hexamerin, a group of proteins belonging to a superfamily that includes arthropod tyrosinase, 
arthropod hemocyanin, and dipteran arylphorin receptor [82]. The hexamerins show clear structural 
similarities with the hemocyanins, but have lost the ability to bind copper ions and transport oxygen. 
They are synthesized in the fat body of a wide range of lepidopteran and dipteran larvae, among other 
insect orders. These proteins accumulate to high concentrations in the hemolymph. Hexamerins appear 
to serve as a storage form of amino acids, a resource required for complete development of the adult, 
since insect pupae do not feed during metamorphosis. In addition to being a storage protein, 
hexamerins appear to play other important roles during the lifespan of insects. There are at least two 
types of hexamerins in Lepidoptera: arylphorin and a methionine-rich storage protein [83]. 

APA has two N-glycans attached at Asn196 and Asn344, although APA possesses four possible 
candidate sequons. Glycosylation of Asn344 is critical for the folding process, whereas glycosylation 
of Asn196 is not. Mass spectrometric analysis revealed that the N344-glycan is a trimmed high-mannose 
type (Man5-6GlcNAc2), whereas the N196-glycan remains in a monoglucosylated N-glycan 
(Glc1Man9GlcNAc2) state and is resistant to peptide N-glycosidase F (PNGase F) treatment [84]. 
Although the recombinant N344Q mutant protein is not secreted into the culture medium, the Asn196Gln 
mutant protein is expressed as in wild type and has the same ecdysone-binding activity as wild type. 
The crystal structure of APA was solved at 2.42 Å resolution. The overall structure of APA is similar 
to that of lobster hemocyanin and is composed of an N-terminal all α-helical fold and a C-terminal 
β-sandwich like fold (Figure 8a). The N-glycan at Asn344 is exposed to solvent and only the 
chitobiose portion is assigned. In contrast, the N-glycan at Asn196 has a clear electron density map and 
is assigned as a monoglucosylated structure. The asymmetric unit of the APA crystal contains six 
monomers (one hexamer) as shown in Figure 8b. While all N-glycan chains at Asn344 are completely 
exposed to solvent in the hexamer, the N-glycans at Asn196 are buried inside the hexamer and well 
organized in the deep cleft of the subunit interface. A comparison between monomeric and hexameric 
APA revealed that the D1 arms of the N-glycans are buried inside during hexamer formation. Indeed, 
the accessible surface areas of a D1 arm in monomeric and hexameric APA are 370 and 195 Å², 
respectively. The N-glycan forms about 20 direct or water-mediated hydrogen bonds with adjacent amino 
acid residues, which are located in the same or different subunits. Typical dihedral φ and ψ angles at 
Manα1-2Man are -60° and -150°, respectively. In contrast, the dihedral φ and ψ angles at 
Manα1-2Man in the D2 arm (Man-D2-Man-A) are 279° and 130°, respectively. The D2 arm of the 
glycan adopts a unique conformation to accommodate the curvature formed by the blue and yellow 
molecules (Figure 8c). Indeed, the accessible surface area of the α1-2 linked Man in the D2 arm 
(Man-D2) is dramatically different in monomeric (217 Å²) and hexameric (86 Å²) APA structures. 
In addition to the inter-subunit disulfide bond between Cys73 and Cys649, extensive intermolecular 
interactions between the N-glycan and APA also stabilize the overall trimer-trimer interaction by 
enhancing interaction between each top and bottom dimer (Figure 8b). Analytical ultracentrifugation 
and guanidinium chloride unfolding experiments revealed that the presence of the N196-glycan is 
important for stabilizing the hexameric state and overall stability of APA. 
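
The burial of the glycan on hexamer formation can be expressed as the difference between the accessible surface areas quoted above. The short Python sketch below only restates that arithmetic. The function name is illustrative, and the areas are taken in Å² on the assumption that the D1 arm values carry the same unit as the Man-D2 values.

```python
def buried_on_assembly(asa_free, asa_assembled):
    """Return (area buried in Å^2, fraction of the free-state area that is buried)."""
    if asa_free <= 0:
        raise ValueError("asa_free must be positive")
    buried = asa_free - asa_assembled
    return buried, buried / asa_free


# Accessible surface areas quoted in the text (monomeric vs. hexameric APA)
d1_buried, d1_fraction = buried_on_assembly(370, 195)   # D1 arm
d2_buried, d2_fraction = buried_on_assembly(217, 86)    # α1-2 linked Man of the D2 arm

print(f"D1 arm: {d1_buried} Å^2 buried ({d1_fraction:.0%})")
print(f"Man-D2: {d2_buried} Å^2 buried ({d2_fraction:.0%})")
```

On these figures, roughly half of the D1 arm surface and about 60% of the Man-D2 surface become inaccessible in the hexamer.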

Figure 8. Immature monoglucosylated N-glycan on Antheraea pernyi arylphorin (PDB 
code; 3gwj). (a) Overall structure of monomeric APA. Protein and carbohydrate are shown 
in ribbon and rod models, respectively. (b) Hexameric structure of APA. Each monomer is 
shown in surface model. The attached N-linked glycans at Asn196 are shown in spheres. 
(c) Close-up view of Asn196-attached N-glycan in hexameric APA. 







3.2. Diglucosylated N-Glycan Stabilizes Inter-Domain Interaction of β-Galactosidase from 
Trichoderma reesei 

β-Galactosidase is an enzyme (E.C. 3.2.1.23) that catalyzes the hydrolysis of β1-3 or β1-4 linked 
Gal residues in oligo- and disaccharides, such as lactose, galactobiose, aryl- and alkyl-β-D-galactosides. 
This enzyme also has the ability to catalyze the reverse reaction of the hydrolysis, called 
transglycosylation. β-Galactosidases have been isolated from various sources, such as animals, plants, 
bacteria, yeasts and fungi. They have many important applications in the industrial and biotechnological 
fields. In the CAZy database [85], the β-galactosidases occur in the GH 1, 2, 35, and 42 subfamilies. 

Trichoderma reesei β-galactosidase (Tr-β-gal) belongs to the GH 35 subfamily. The sequence and 
the enzymatic properties of this industrially useful enzyme have previously been reported [86,87]. 
Crystal structures of Tr-β-gal in unliganded and ligand-bound forms 
were solved at 1.2-1.75 Å resolution (PDB code; 3og2, 3ogr, 3ogs and 3ogv [88]). The overall structure 
of Tr-β-gal consists of six domains (Figure 9a). The N-terminal domain forms an eight-stranded α/β 
barrel structure and is responsible for the catalytic reaction. The subsequent five domains form 
anti-parallel β-sandwich structures. Tr-β-gal possesses 11 putative N-linked glycosylation sites on the 
surface of the protein [87]. Ten of the 11 sites are exposed (Asn810 is buried). Electron density maps 
corresponding to the carbohydrate moieties of the N-glycans are observed at five of the positions. Two 
of these (Asn627 and Asn930) contain oligosaccharide chains that represent high-mannose glycan 
forms. One of these positions (Asn930) has a diglucosylated high-mannose type glycan of the form 
Glc2Man8GlcNAc2 (PDB code; 3og2). The glycan at Asn930 is located between three different 
domains (first, fifth, and sixth domains) and makes a number of hydrogen bonds with the protein 
surface, stabilizing the structure of Tr-β-gal (Figure 9b). The carbonyl oxygen from Ile955 interacts 
with the chitobiose core. The side chain of Asp776 bridges β-Man (Man-3) and Glc residues. The 
glycan at Asn930 is also connected to the catalytic domain because OD1 and OD2 from Asp265 
tightly interact with an α1-2 linked Man (Man-C). This glycan covers several hydrophobic and 
aromatic amino acid residues and may protect the structure from proteolysis. The β-Man (Man-3) is 
the most buried residue and reaches into the protein cavity (accessible surface area only ~36 Å²). The 
glucose units of Tr-β-gal reach very close to the catalytic site. Thus, it is plausible that glycosylation 
affects the catalytic properties of the enzyme. 
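
Putative N-linked glycosylation sites such as those counted above are usually located by scanning the amino acid sequence for the N-X-S/T sequon, where X is any residue except proline. The following Python sketch illustrates such a scan. The function name and the toy peptide are hypothetical, the rarer N-X-C motif is ignored, and a match says nothing about whether the site is actually occupied or exposed.

```python
import re

# N-X-S/T sequon: Asn, then any residue except Pro, then Ser or Thr.
# The lookahead leaves the following residues unconsumed, so overlapping
# sequons (for example "NNTS") are all reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(sequence):
    """Return the 1-based positions of Asn residues that start an N-X-S/T sequon."""
    return [match.start() + 1 for match in SEQUON.finditer(sequence.upper())]


toy_peptide = "ANGSNPTNKT"  # hypothetical sequence, not taken from any protein
positions = find_sequons(toy_peptide)
print(positions)
```

In the toy peptide the asparagine followed by proline is skipped, so only the first and last asparagines are reported. Applied to a real sequence, the resulting list would still need to be compared with the electron density, since only some candidate sites carry an ordered glycan.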

Recombinant Tr-β-gal was overexpressed by a Trichoderma reesei expression system [89]. 
In fungal expression systems, Glc residues are expected to be trimmed by glucosidases I (GLS-I) and 
II (GLS-II). Nevertheless, the N-glycan attached at Asn930 in Tr-β-gal is diglucosylated and interacts 
with residues far apart in the primary sequence. This suggests that the protein folding of Tr-β-gal might 
be finished before GLS-II encounters the terminal Glc residues on the glycan. The tight association 
between the D1 arm and the protein surface likely prevents access of the glucose units to the catalytic 
site of GLS-II, inhibiting deglucosylation and further conversion to complex-type glycans. 

Figure 9. Immature diglucosylated N-glycan on β-galactosidase from Trichoderma reesei. 
(a) Overall structure of Tr-β-gal (PDB code; 3ogv). N-linked glycans at Asn627 and Asn930 
are shown in sphere models. The region delineated in black dotted lines is magnified in (b). 
(b) Close-up view of the Asn930-attached glycan. Hydrogen bonds are shown as red 
dotted lines. 







4. What Is the Function of Mobile/Disordered N-Glycans? 

Cell surface membrane proteins, such as cell surface receptors, are often glycosylated. Crystal 
structures of these cell surface glycoproteins have revealed that their N-glycans are usually mobile and 
most of the sugars disordered. Although surface N-glycans are sometimes immobilized by symmetry-related 
molecules in crystal packing, only the first one or two GlcNAc residues are usually ordered enough to 
be traced in the electron density map. The alteration of glycoform is associated with various 
physiological and pathological events, including tumor invasion [90]. These disordered or "missing" 
glycans are therefore considered to have physiological functions, as do the highly ordered 
N-glycans. In this section, we introduce several structures of cell surface glycoproteins which play a 
role in the immune system and cell-cell adhesion. 

Infectious diseases caused by various pathogens account for about one-third of all human deaths in 
the world, more than all forms of cancer combined [91]. To fight against these powerful pathogens, 
vertebrates use two types of immune defense carried out by specialized proteins and cells: the innate 
immune response and the adaptive immune response. Innate immunity is based on an ancient and 
ubiquitous system of cells and molecules that defend the host against infection. This system can 
recognize virtually all microbes using a limited repertoire of germ-line-encoded receptors that 
recognize broadly conserved components of bacterial and fungal cell walls or genetic material, such as 
double-stranded viral RNA (dsRNA) [92,93]. Toll-like receptors (TLRs) are the most important 
sensors in the innate immune system [94]. Ten human TLRs (TLR1-10) which specifically recognize 
pathogen-associated molecules have been identified. Human TLR3 is activated by dsRNA associated 
with viral infection, endogenous cellular mRNA, and sequence-independent small interfering RNAs. 
The human TLR3 ectodomain is a large horseshoe-shaped solenoid-like structure assembled from 23 
leucine-rich repeats (PDB code; 1ziw, [95]). The human TLR3 ectodomain possesses 15 potential 
N-glycosylation sites. Due to the poor electron density of the carbohydrate moieties, only one or two 
GlcNAc residues are assigned at eight of the N-glycosylation sites. A putative fully glycosylated model 
structure reveals that almost all the surface of the TLR3 molecule is covered by carbohydrate, but one 
face is glycosylation-free. The structure of the mouse TLR3 ectodomain in complex with dsRNA 
demonstrates that dsRNA contacts occur through residues on the glycosylation-free surface (PDB 
code; 3ciy, [96]). The glycosylation sites of human and mouse TLR3 are almost identical. 
Compared with human TLR3, the electron density of carbohydrate is more clearly observed in mouse TLR3. 
The N-glycan at Asn413, located in the vicinity of dsRNA, is assigned as Man3GlcNAc2, and an α1-6 linked 
Man residue interacts with the sugar-phosphate backbone of dsRNA. As in the case of human TLR3, 
the location of the N-glycans is thought to help restrict the ligand to its preferred orientation 
(Figure 10a). 

As for adaptive immune systems, most of the cell surface receptors which are involved in antigen 
recognition by T cells and in the orchestration of the subsequent cell signaling events are glycosylated. 
Rudd et al. postulated that the N-glycans on the protein surface play a wide range of roles, such as 
controlling the assembly and stabilization of the protein complexes in the adaptive immune system [97]. 
N-linked glycans attached to the membrane-proximal domains of CD2 (PDB code; 1hnf, [98]) and 
CD48 (PDB code; 2dru, [99]) are distributed so as to provide a scaffold to orient the binding faces, 
which leads to increased apparent affinity (Figure 10b). Moreover, the glycans on T-cell receptors 
(TCR) are located over the protein surface in such a way that they could prevent non-specific 
aggregation. Another important point to be emphasized is that N-glycans limit the possible geometry 
and spacing of TCR/major histocompatibility complex (MHC) clusters which precede cell signaling. 

Cells produce, organize and degrade extracellular matrix. The matrix in turn exerts a powerful 
influence on the cells, mainly through "matrix receptors". Integrins are the principal matrix receptors 
on animal cells and transmit bidirectional signals across the plasma membrane, thereby linking the 
extracellular environment to the internal actin cytoskeleton [100]. All integrins are non-covalently 
linked heterodimeric molecules consisting of one α and one β subunit, which together create a binding 
site for specific extracellular ligands on that part of the protein furthest from the membrane. α5β1 
integrin is a major cellular receptor for the extracellular matrix protein fibronectin and plays a 
fundamental role during mammalian development. Fibronectin is a principal component of the 
extracellular matrix and is a modular protein composed of homologous repeats of small domains with 
an elongated shape arranged as "beads on a string". α5β1 integrin interacts with fibronectin through 
Arg-Gly-Asp (RGD) sequences present in a flexible loop region in the middle of the protein. A crystal 
structure of the α5β1 integrin ectodomain shows the fibronectin-binding pocket surrounded by four 
N-glycans (two in α5 and the other two in β1), forming a trench-like exposed surface along the subunit 
interface. This topography and location of the N-glycans presumably limit the choice of docking 
orientations when the elongated fibronectin molecule tries to make close contact (docking model 
based on the α5β1 integrin ectodomain (PDB code; 3vi4) and fibronectin fragments (PDB code; 2mfn and 
1fhf) [101,102]; Figure 10c). 

Figure 10. Highly flexible N-glycans on cell surface receptors. Complex-type N-glycans 
(GlcNAc2Man3GlcNAc2Fuc) are superimposed, based on the position of chitobiose or 
sequons by using LSQKAB [103]. (a) Fully glycosylated Toll-like receptor-3 (TLR3) 
ectodomain in complex with dsRNA (PDB code; 3ciy). Protein molecules are shown as 
green and cyan surface models. dsRNA is shown as a gray sphere. (b) Extracellular 
domains of CD2 (PDB code; 1hnf) and CD48 (PDB code; 2dru). (c) Crystal structure of 
α5β1 integrin ectodomain (PDB code; 3vi4) and fibronectin FN7-10 fragment (PDB code; 
1fhf). In the fibronectin structure, the amino acid residues which interact with α5β1 integrin 
are shown in red stick model. Dashed lines on α5β1 integrin outline the shallow groove 
formed by N-glycans. (d) Intercellular adhesion molecule (ICAM)-2 ectodomains 
(PDB code; 1zxq). 







The intercellular adhesion molecules 1-3 (ICAM-1-3) are members of the immunoglobulin 
superfamily (IgSF) and are known ligands for integrins on the surface of cells. ICAM-2, a ligand for 
αLβ2 integrin, is composed of two N-terminal extracellular Ig domains (which share 35% sequence 
homology with ICAM-1), a transmembrane domain, and a short cytoplasmic tail. The Ig domain, the 
characteristic building block of IgSF, consists of two antiparallel β-sheets packed tightly against each 
other and linked by a disulfide bond. The crystal structure of the two extracellular Ig domains of 
ICAM-2 (PDB code; 1zxq) shows that they adopt a hockey stick shape. Six N-glycans are exposed to 
solvent and assigned as chitobiose portions. When one looks down the axis of the two Ig domains, 
these N-glycans are seen as uniformly distributed around the perimeter of the domains [104] 
(Figure 10d), and are well placed to prevent non-specific protein aggregation. 

These examples illustrate the important role of glycans in assisting the docking of large ligands, both 
by orienting the ligand and by spacing the receptor through inhibiting aggregation. It has been 
considered that a principal role of the mobile carbohydrate attached to cell surface receptors is to 
increase protein conformational stability [105]. For example, solution analysis of RNase B revealed that the 
glycan has a significant stabilizing effect on the protein structure by decreasing the flexibility of the 
protein backbone both near to and distant from the glycan attachment site [106]. These secondary 
functions of glycans, supplementary to actual binding interactions, seem to be fundamental to the role 
of the carbohydrates in glycoproteins on cell surfaces. 

5. Future Perspective 

Several structural and functional aspects of glycosylation, in terms of intra- and inter-molecular 
interactions, are provided by available crystal structures. An emphasis on glycoprotein-oriented structural 
biology will further the understanding of glycan function. Crystallographic analysis of glycoproteins 
will require a more thorough investigation of the glycoprotein expression system and of glycan 
structure. Although methodologies for glycoprotein production are advancing [107,108], improvements 
are required in the expression and selection of a particular glycoform of a target protein. The glycan 
sequence at each glycosylation site can be analyzed in advance by mass spectrometry, coupled with 
liquid chromatography. However, even with this sequence information, the electron density of the 
glycan must be interpreted with great caution. Backbone flexibility dictates that structural information 
on the glycan is largely missing in X-ray diffraction data. Other techniques, such as molecular dynamics 
simulation and NMR, are needed to help join the discontinuous snapshots derived from X-ray studies 
and to evaluate the contribution of the glycan moieties to molecular fluctuations. 

The glycoform at a particular site on a protein is often closely related to physiological function and 
is tightly regulated [18,46,47,109]. However, in many cases the structural relationship between 
glycoform and function still remains unclear. Recent technical advances in the chemical and enzymatic 
syntheses of homogeneous glycoproteins are going to make further valuable contributions to the study 
of glycoform-function relationships [110-112]. A survey of current PDB data indicates that the glycan 
structures of available glycoproteins are mainly of the high-mannose type, and may be biased due to 
the expression system used. Thus the relationship between glycoform function and its 3D structure 
needs careful investigation. Structural biology focusing on glycoform-specific functions is the 
challenge for the near future. 